Multi-GPU Parallelization of the NAS Multi-Zone Parallel Benchmarks
Abstract
GPU-based computing systems have become a widely accepted solution for the high-performance-computing (HPC) domain. GPUs have shown highly competitive performance-per-watt ratios and can exploit an astonishing level of parallelism. However, exploiting the peak performance of such devices is a challenge, mainly due to the combination of two essential aspects of multi-GPU execution. On one hand, the workload should be distributed evenly among the GPUs. On the other hand, communications between GPU devices are costly and should be minimized. Therefore, a trade-off between work-distribution schemes and communication overheads will condition the overall performance of parallel applications run on multi-GPU systems. In this article we present a multi-GPU implementation of the NAS Multi-Zone Parallel Benchmarks (whose execution alternates between communication and computational phases). We propose several work-distribution strategies that try to distribute the workload evenly among the GPUs. Our evaluations show that performance is highly sensitive to this distribution strategy, as the communication phases of the applications are heavily affected by the work-distribution schemes applied in the computational phases. In particular, we consider Static, Dynamic, and Guided schedulers to find a trade-off between both phases and maximize the overall performance. In addition, we compare those schedulers with an optimal scheduler computed offline using IBM CPLEX. On an evaluation environment composed of 2 x IBM Power9 8335-GTH and 4 x NVIDIA V100 (Volta) GPUs, our multi-GPU parallelization outperforms single-GPU execution by 1.48x to 1.86x (2 GPUs) and by 1.75x to 3.54x (4 GPUs). This article analyses these improvements in terms of the relationship between the computational and communication phases of the applications as the number of GPUs is increased. We show that Guided schedulers perform at a level similar to optimal schedulers.
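The trade-off between the Static, Dynamic, and Guided schedulers named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the zone sizes, the function names, and the chunking rules (borrowed from the usual OpenMP-style meaning of static, dynamic, and guided) are assumptions made for illustration, and the model ignores the communication phases that the article identifies as decisive. It only shows how each policy spreads unevenly sized zones across GPUs, plus the parallel efficiency implied by the reported best-case speedups.

```python
import heapq
import math


def static_schedule(zone_sizes, num_gpus):
    """Assumed static policy: contiguous blocks with near-equal zone counts."""
    n = len(zone_sizes)
    return [
        sum(zone_sizes[g * n // num_gpus:(g + 1) * n // num_gpus])
        for g in range(num_gpus)
    ]


def chunked_schedule(zone_sizes, num_gpus, chunk_fn):
    """Give the next chunk of zones to the GPU with the least accumulated work."""
    heap = [(0, g) for g in range(num_gpus)]  # (accumulated work, gpu id)
    i, n = 0, len(zone_sizes)
    while i < n:
        chunk = max(1, chunk_fn(n - i))
        load, g = heapq.heappop(heap)
        load += sum(zone_sizes[i:i + chunk])
        heapq.heappush(heap, (load, g))
        i += chunk
    loads = [0] * num_gpus
    for load, g in heap:
        loads[g] = load
    return loads


def dynamic_schedule(zone_sizes, num_gpus):
    """Assumed dynamic policy: one zone at a time."""
    return chunked_schedule(zone_sizes, num_gpus, lambda remaining: 1)


def guided_schedule(zone_sizes, num_gpus):
    """Assumed guided policy: chunk size shrinks with the remaining zones."""
    return chunked_schedule(
        zone_sizes, num_gpus, lambda remaining: math.ceil(remaining / num_gpus)
    )


def imbalance(loads):
    """Ratio of the most loaded GPU to the mean load (1.0 is perfect balance)."""
    return max(loads) / (sum(loads) / len(loads))


def parallel_efficiency(speedup, num_gpus):
    """Speedup over single-GPU execution divided by the number of GPUs."""
    return speedup / num_gpus


# Hypothetical zone sizes (arbitrary work units), not taken from the benchmarks.
zones = [16, 1, 1, 1, 8, 2, 2, 4]
for name, schedule in [("static", static_schedule),
                       ("dynamic", dynamic_schedule),
                       ("guided", guided_schedule)]:
    loads = schedule(zones, 2)
    print(name, loads, round(imbalance(loads), 3))

# Best reported speedups from the abstract: 1.86x on 2 GPUs, 3.54x on 4 GPUs.
print(parallel_efficiency(1.86, 2), parallel_efficiency(3.54, 4))
```

In this toy model every policy assigns all the work and differs only in how evenly it lands on each device. A fuller model would also charge a cost for each zone boundary that crosses GPUs, which is the communication effect the abstract describes.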
Similar resources
The NAS Parallel Benchmarks
A new set of benchmarks has been developed for the performance evaluation of highly parallel supercomputers. These benchmarks consist of five parallel kernels and three simulated application benchmarks. Together they mimic the computation and data movement characteristics of large scale computational fluid dynamics (CFD) applications. The principal distinguishing feature of these benchmarks is ...
The NAS Parallel Benchmarks 2.0
We describe a set of implementations of the NAS Parallel Benchmarks based on Fortran 77 and the MPI message passing standard. These implementations, which are intended to be run with little or no tuning, approximate the performance a typical user can expect for a portable parallel program on a distributed memory computer. They complement rather than replace the original NAS Parallel Benchmarks....
The NAS Parallel Benchmarks
DEFINITION: The NAS Parallel Benchmarks (NPB) are a suite of parallel computer performance benchmarks. They were originally developed at the NASA Ames Research Center in 1991 to assess high-end parallel supercomputers [?]. Although they are no longer used as widely as they once were for comparing high-end system performance, they continue to be studied and analyzed a great deal in the high-perf...
The NAS Parallel Benchmarks 2.1 Results
We present performance results for version 2.1 of the NAS Parallel Benchmarks (NPB) on the following architectures: IBM SP2/66 MHz, SGI Power Challenge Array/90 MHz, Cray Research T3D, and Intel Paragon. "MILl, Inc. This work is supported through NASA Contract NAS 2-14303. NASA Ames Research Center, Moffett Field, CA, 94035-1000. Sterling Software, Palo Alto, CA. This work is supported throug...
Journal
Journal title: IEEE Transactions on Parallel and Distributed Systems
Year: 2021
ISSN: 1045-9219, 1558-2183, 2161-9883
DOI: https://doi.org/10.1109/tpds.2020.3015148